Autothresholding of Noisy Images

Field of the Invention 

This invention relates to automatizable methods of image processing which
include a method of selecting an optimal threshold value (t_0) for an image, where the
optimal threshold value (t_0) is selected as a function of mean feature size data (S(t))
representing the mean feature size (S) of features detected in the image at each of a
plurality of threshold values (t). The optimal threshold value (t_0) so determined may be
used to threshold the image to produce a low-noise binary image.

Background of the Invention 

In general, it is known to derive a binary (two-tone) image from a multi-tone 
image by a process of "thresholding." In thresholding, pixels or picture elements of the 
image are assigned one of two tones (e.g., black or white), based on a determination of 
whether the initial tone of the pixel or picture element (e.g., a shade of gray) is above or 
below a chosen threshold (e.g., a chosen level of brightness). 
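
As an illustration only, the thresholding operation described above can be sketched in a
few lines. The function name `threshold_image`, the sample pixel values, and the choice
of "strictly above t maps to white (1)" are assumptions made for this sketch and are not
taken from the cited art.

```python
def threshold_image(image, t):
    """Return a binary image: 1 where the pixel value is above t, else 0."""
    return [[1 if pixel > t else 0 for pixel in row] for row in image]

# Hypothetical 3x4 gray-scale image with values in 0..255.
gray = [
    [10, 200, 210, 12],
    [15, 190, 30, 20],
    [8, 11, 220, 205],
]

# Threshold at a mid-gray level (an arbitrary choice for the example).
binary = threshold_image(gray, 128)
```

Every pixel brighter than the chosen level becomes 1 and every other pixel becomes 0, so
the result contains only two tones.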

Co-pending applications USSN 09/422,584, filed October 21, 1999, and USSN 
09/422,535, filed October 21, 1999, disclose that features of interest within an image 
may be identified by thresholding. 

U.S. Pat. No. 5,832,111 concerns a method for thresholding a video signal to
produce a binary segmented video signal. 

U.S. Pat. No. 6,199,986 concerns an automated method for measuring the 
wavefront aberration of an eye. 

John C. Russ, The Image Processing Handbook, 2d ed. (CRC Press 1995) at 
pages 394-96 and 416-18, describes feature detection in a digitized image. 



Summary of the Invention 

Briefly, the present invention provides an automatizable method of selecting an
optimal threshold value (t_0) for an image comprising the steps of: obtaining an image;
selecting a test segment of the image; determining the mean feature size (S) of features
appearing in the test segment at each of a plurality of threshold values (t), so as to
produce mean feature size data (S(t)); selecting a relevant subset of the mean feature
size data (S(t)); and determining an optimal threshold value (t_0) as a function of the
relevant subset of the mean feature size data. Typically, the optimal threshold value (t_0)
is determined to be a threshold value equal to or approximating a midpoint of the
relevant subset of mean feature size data. More typically, the optimal threshold value
(t_0) is taken as the weighted average threshold value (t) appearing in the relevant subset
of mean feature size data, weighted according to mean feature size (S(t)). Typically, the
relevant subset is determined so as to exclude a low threshold saturation plateau in the
mean feature size data. This may be accomplished by excluding values for which S(t)
is greater than a given percentage of the maximum value of S observed in the mean
feature size data (S_max), which may be between 1% and 100% and is typically 1%.
Alternately, and more typically, counting data (N(t)) is produced by counting the
number of features (N) appearing in the test segment of the image at each of a plurality
of threshold values (t) and the relevant subset of the mean feature size data (S(t)) is
selected as a function of the counting data (N(t)). Typically, a cutoff threshold value
(t_c) is determined from the counting data (N(t)) and used to determine the relevant
subset of the mean feature size data (S(t)). The relevant subset of the mean feature size
data (S(t)) is selected by including all values for which t > t_c or all values for which
t ≥ t_c. Typically, the cutoff threshold value (t_c) is determined as a function of the
counting data (N(t)), which function provides a cutoff threshold value (t_c) that falls in a
transition range between an early noise peak of the counting data (N(t)) and a following
low-slope region. Typically, the cutoff threshold value (t_c) is determined by a function
t_c = 2 t_P% - t_Q%; wherein t_P% is the lowest value of t for which N(t) has not exceeded
P% of the maximum value of N observed in the counting data (N_max), wherein t_Q% is the
lowest value of t for which N(t) has not exceeded Q% of the maximum value of N observed
in the counting data (N_max), wherein P% > Q%, and wherein P% and Q% are
independently selected from numbers between 1% and 100%. More typically, P% is
selected from numbers between 50% and 95% and Q% is selected from numbers
between 5% and 95%.

In another aspect, the present invention provides an automatizable method of
thresholding an image including any of the methods disclosed herein for choosing an
optimal threshold value (t_0) for the image.

In another aspect, the present invention provides a system for determining an
optimal threshold value (t_0) for an image, comprising: a) an image device for providing
a digitized image; b) a data storage device; and c) a central processing unit for receiving
the digitized image from the image device and which can write to and read from the
data storage device, where the central processing unit is programmed to perform a
method disclosed herein for selecting an optimal threshold value (t_0) for the image.

In another aspect, the present invention provides a system for thresholding an
image, comprising: a) an image device for providing a digitized image; b) a data storage
device; and c) a central processing unit for receiving the digitized image from the image
device and which can write to and read from the data storage device, where the central
processing unit is programmed to perform a method disclosed herein for thresholding
an image.

In another aspect, the present invention provides data storage media having
recorded thereon software that upon installation in a computer and execution of the
software will cause the computer to carry out a method disclosed herein for selecting an
optimal threshold value (t_0) for the image.

In another aspect, the present invention provides data storage media having
recorded thereon software that upon installation in a computer and execution of the
software will cause the computer to carry out a method disclosed herein for
thresholding an image.

What has not been described in the art, and is provided by the present invention,
is an automatizable method to select an optimal threshold value (t_0) for an image which
minimizes noise artifacts by choosing t_0 as a function of mean feature size data (S(t)).

In this application:

"Binary image" means an image in two tones or values, e.g. black and white,
essentially without any intermediate values, e.g. gray shades. In the case of a digital
image, all intermediate values are typically excluded, whereas total exclusion of
intermediate values may not be possible in a photochemical image.

"Feature" means a group of contiguous pixels or picture elements having similar
values. Contiguous pixels or picture elements may be defined in any suitable manner.
Contiguous pixels or picture elements may include only those sharing an edge
(four-connected rule) or may more typically include those sharing an edge or a corner
(eight-connected rule). Alternately, contiguous pixels or picture elements may be defined
as those which can be connected by a straight line segment that does not pass through any
third pixel or picture element. For a binary image, pixels or picture elements having
similar values are pixels or picture elements having essentially the same value.
Typically, one value is designated a background value and "features" of that value are
excluded.

"Test segment" means, with regard to an image, a portion of the image chosen
to be tested, which may include up to 100% of the image.

"Mean feature size" means the average size of a collection of features, typically
expressed as area size, number of pixels or picture elements, or similar measures.

It is an advantage of the present invention to provide a method of thresholding
an image which may be automated, which excludes noise, and which provides accurate
feature detection.
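
The difference between the four-connected and eight-connected rules in the definition of
"feature" can be made concrete with a small flood-fill sketch. The function name
`feature_sizes`, the use of 1 for feature pixels and 0 for background, and the sample
image are assumptions introduced for illustration; this is one possible way to label
features, not necessarily the way any particular implementation does it.

```python
from collections import deque

def feature_sizes(binary, connectivity=8):
    """Return the pixel count of each feature (connected group of 1-pixels)."""
    if connectivity == 4:
        # Four-connected rule: neighbors share an edge.
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:
        # Eight-connected rule: neighbors share an edge or a corner.
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited feature pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes

# Three pixels touching only at their corners.
diagonal = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
four = feature_sizes(diagonal, connectivity=4)
eight = feature_sizes(diagonal, connectivity=8)
mean_size_eight = sum(eight) / len(eight)
```

Under the four-connected rule the diagonal is three one-pixel features; under the
eight-connected rule it is a single three-pixel feature, so the mean feature size depends
on which rule is chosen.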



Brief Description of the Drawing 

Fig. 1 is a schematic illustration of a prototypical scanning system with which
the present invention might be used.

Fig. 2 is a simulated noisy image in 256 shades of gray, of the sort that might be
captured by the scanning system of Fig. 1.

Fig. 3 is a graph showing number of detected features (N(t)) vs. detection level
threshold (t) (left scale) and mean feature size (S(t)) vs. detection level threshold (t)
(right scale) obtained for the noisy image of Fig. 2 according to the method of the
present invention.

Figs. 4-9 are binary images obtained by thresholding the noisy image of Fig. 2 at
the following threshold levels: 35, 50, 75, 140, 205 and 225. Fig. 7 represents the
product of the thresholding method according to the present invention.

Detailed Description of Preferred Embodiments

The present invention provides image processing methods that include a method
of selecting an optimal threshold value (t_0) for an image comprising the steps of:
obtaining an image; selecting a test segment of said image; determining the mean
feature size (S) of features appearing in the test segment at each of a plurality of
threshold values (t), so as to produce mean feature size data (S(t)); selecting a relevant
subset of the mean feature size data (S(t)); and determining an optimal threshold value
(t_0) as a function of said subset of the mean feature size data.

The step of obtaining an image may be accomplished by any suitable method.
The image may be collected by use of a video camera, digital camera, photochemical
camera, microscope, visual scanning system, probe scanning system, or other sensing
apparatus which produces data points in a two-dimensional array. Typically, the target
image is expected to be essentially a binary image containing distinct features, which,
however, may additionally contain noise. If the image is not initially created in digital
form by the image capturing or creating equipment, the image is typically digitized into
pixels. Typically, the methods described herein are accomplished with use of a central
processing unit or computer.

Fig. 1 illustrates a scanning system with which the present invention might be
used. In the system of Fig. 1, a focused beam of light moves across an object and the
system detects the resultant reflected or fluorescent light. To do this, light from a light
source 10 is focused through source optics 12 and deflected by mirror 14 onto the
object, shown here as a sample 3x4 assay plate 16. The light from the light source 10
can be directed to different locations on the sample by changing the position of the
mirror 14 using motor 24. Light that fluoresces or is reflected from sample 16 returns
to detection optics 18 via mirror 15, which typically is a half silvered mirror.
Alternatively, the light source can be applied centrally, and the emitted or fluoresced
light can be detected from the side of the system, as shown in US 5,900,949, or the light
source can be applied from the side of the system and the emitted or fluoresced light
can be detected centrally, or any other similar variation. Light passing through
detection optics 18 is detected using any suitable image capture system 20, such as a
television camera, CCD, laser reflective system, photomultiplier tube, avalanche
photodiode, photodiodes or single photon counting modules, the output from which is
provided to a computer 22 programmed for analysis and to control the overall system.
Computer 22 typically will include a central processing unit for executing programs and
systems such as RAM, hard drives or the like for data storage. It will be understood
that this description is for exemplary purposes only; the present invention can be used
equally well with "simulated" images generated from magnetic or tactile sensors, not
just with light-based images, and with any object to be examined, not just sample 16.

In the step of selecting a test segment of the image, typically a test segment will
be sought which is representative of the entire image. A test segment may be selected
for each new image separately or a fixed test segment area may be used for each of a
number of images. In an automated system, the test segment is typically predetermined
and the same segment area is used for all images processed. The test segment may
contain up to 100% of the image. For ease of computation, the test segment may
contain less than all of the image. If the test image is expected to contain particular
features, the test segment typically contains at least one expected feature and more
typically four or more.

To produce mean feature size data (S(t)), the mean feature size (S) of features
appearing in the test segment is determined at each of a plurality of threshold values (t).
As noted above, "features" may be defined in a number of ways. Typically the
eight-connected rule is used. The same definition is typically used throughout the process.
Features touching the border of the test segment may be included or excluded. Features
touching the border of the test segment that are included may be included in their entire
size or only in the portion falling within the test segment. The same rule regarding
features touching the border is typically used throughout the process. Typically,
features touching the border of the test segment are included in the test segment but
only in the portion falling within the test segment. Mean feature size data (S(t)) may be
calculated for all of the available values of t, which may be 256 values in a digital
image. If fewer values are used, calculation time may be improved at the expense of
precision in determining the optimal threshold value (t_0). Typically, mean feature
size data (S(t)) is calculated for every fourth value of t in a digital image having 256
possible values.
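
A minimal sketch of this sweep follows, assuming the eight-connected rule, features
brighter than background, a test segment equal to the whole image, and a mean feature
size of 0 when no features are detected. The function names `label_sizes` and `sweep`,
the convention "pixel > t is a feature pixel", and the tiny sample image are assumptions
made for the example only.

```python
def label_sizes(binary):
    """Sizes of 8-connected features (groups of 1-pixels) via iterative flood fill."""
    rows, cols = len(binary), len(binary[0])
    seen = set()
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or (r, c) in seen:
                continue
            seen.add((r, c))
            stack, size = [(r, c)], 0
            while stack:
                y, x = stack.pop()
                size += 1
                # Visit the 3x3 neighborhood, clipped to the image border.
                for ny in range(max(y - 1, 0), min(y + 2, rows)):
                    for nx in range(max(x - 1, 0), min(x + 2, cols)):
                        if binary[ny][nx] and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            stack.append((ny, nx))
            sizes.append(size)
    return sizes

def sweep(image, step=4, levels=256):
    """Return {t: (N(t), S(t))} for every `step`-th threshold value."""
    data = {}
    for t in range(0, levels, step):
        binary = [[1 if p > t else 0 for p in row] for row in image]
        sizes = label_sizes(binary)
        n = len(sizes)
        s = sum(sizes) / n if n else 0.0  # assumption: S(t) = 0 when N(t) = 0
        data[t] = (n, s)
    return data

# Hypothetical 4x4 image: two bright features (200) plus two dim noise pixels (60).
image = [
    [200, 200, 0, 60],
    [200, 200, 0, 0],
    [0, 0, 0, 0],
    [60, 0, 200, 200],
]
data = sweep(image)
```

At low thresholds the two noise pixels count as features and pull the mean size down;
once the threshold passes their gray level, only the two bright features remain.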

Fig. 3 includes a graph of mean feature size (S(t)) vs. detection level threshold 
(t) (right scale) obtained for the simulated noisy image of Fig. 2. Data for selected 
threshold levels appears in Table I. 

Table I

Fig.   Threshold level   Mean feature size (pixels)   Number of features
4      35                697.5                        176
5      50                35.1                         1987
6      75                187.3                        270
7      140               192.9                        256
8      205               131.2                        284
9      225               6.4                          1341


Note that, for purposes of the present discussion and the present figures, features 
are represented as white images on a black background and lighter shades are 
represented by higher values of t. However, the opposite conventions may be employed 
by adjusting the terms as required, e.g., reversing "greater than" and "less than." 

Fig. 3 also includes a graph of counting data, plotted as number of detected
features (N(t)) vs. detection level threshold (t) (left scale of Fig. 3), obtained
by counting the number of features (N) appearing in the image at each of a plurality of
threshold values (t); Table I lists the counts at selected threshold levels. For the
calculations reflected in Fig. 3, the test segment was the entire image, a "feature" was
defined using the eight-connected rule, and calculations were made for every fourth
value of t.

Figs. 4-9 are binary images obtained by thresholding the noisy image of Fig. 2 at 
the following threshold levels: 35, 50, 75, 140, 205 and 225, as indicated in Table I. 
Fig. 7 was obtained by thresholding the noisy image at an optimal threshold value (t_0)
determined according to the present invention, and represents a binary image product of 
the thresholding method according to the present invention. 

Inspection of Figures 4-9 indicates that Fig. 7 represents the best defined and
most noise-free binary image. Inspection of Table I indicates that the threshold
associated with Fig. 7 appears at a shallow local minimum of N(t) (number of features)
and at a shallow local maximum of S(t) (size of features). These observations are
consistent with the minimal noise pixels observed in both the background and the
feature areas. In contrast, Fig. 6 demonstrates noise pixels in the background areas and
Fig. 8 demonstrates noise pixels in the feature areas.
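
The observation about Table I can be checked directly against the tabulated rows. The
sketch below uses only the six rows of Table I; the helper name `is_local_extremum` is
hypothetical, and "local" here means relative to the neighboring tabulated rows, not to
the full curves of Fig. 3.

```python
# Rows of Table I: (threshold level, mean feature size in pixels, number of features)
table_i = [
    (35, 697.5, 176),
    (50, 35.1, 1987),
    (75, 187.3, 270),
    (140, 192.9, 256),
    (205, 131.2, 284),
    (225, 6.4, 1341),
]

def is_local_extremum(values, i, kind):
    """True if values[i] is strictly below ('min') or above ('max') both neighbors."""
    left, mid, right = values[i - 1], values[i], values[i + 1]
    return left > mid < right if kind == "min" else left < mid > right

idx = [row[0] for row in table_i].index(140)  # the row for Fig. 7
sizes = [row[1] for row in table_i]
counts = [row[2] for row in table_i]

n_is_min = is_local_extremum(counts, idx, "min")  # 270 > 256 < 284
s_is_max = is_local_extremum(sizes, idx, "max")   # 187.3 < 192.9 > 131.2
```

Both checks hold for the threshold level of 140, and the differences from the
neighboring rows are small, which matches the description of the extrema as shallow.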

In the method according to the present invention, the optimal threshold value
(t_0) is determined to be a threshold value equal to or approximating a midpoint of the
plateau region of the mean feature size data, which occurs in Fig. 3 between t=75 and
t=205. Typically, the optimal threshold value (t_0) is taken as the weighted average
threshold value (t) appearing in the mean feature size data, weighted according to mean
feature size (S(t)), after excluding the low threshold saturation plateau in the mean
feature size data to obtain a relevant subset of the mean feature size data. The low
threshold saturation plateau can be observed between t=0 and about t=50 in Fig. 3.
Figs. 3 and 4 demonstrate that the image is "washed out" at these low threshold values
and the features are effectively joined to form one or few contiguous features, resulting
in a very high mean feature size. The relevant subset may be obtained by excluding
mean feature size data values (S(t)) for which S(t) is greater than a given percentage of
the maximum value of S observed in the mean feature size data (S_max), which may be
between 1% and 100% and is typically 1%. Typically, a percentage is selected that is
somewhat greater than the ratio of the size of features expected in the test segment of
the image to the size of the entire test segment of the image, since the maximum value
of S is expected to approach the size of the entire test segment of the image.
Alternately, a fixed value of S(t) may be selected as a cutoff value to eliminate
data in the low threshold saturation plateau.
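
The weighted-average rule with a percentage cutoff on S_max can be sketched as follows.
The function name `optimal_threshold` and the parameter `max_fraction` are hypothetical.
The input reuses the six rows of Table I, which are only a sparse, unevenly spaced sample
of the data plotted in Fig. 3, and the 50% cutoff is an arbitrary value chosen so that
the example is meaningful on so few points; the result is therefore illustrative and is
not expected to reproduce the threshold level used for Fig. 7.

```python
def optimal_threshold(s_data, max_fraction):
    """Weighted-average threshold over the relevant subset of S(t).

    s_data maps threshold t -> mean feature size S(t). Entries with
    S(t) > max_fraction * S_max are treated as the low threshold
    saturation plateau and excluded.
    """
    s_max = max(s_data.values())
    subset = {t: s for t, s in s_data.items() if s <= max_fraction * s_max}
    total = sum(subset.values())
    if total == 0:
        raise ValueError("relevant subset is empty or has zero total feature size")
    # Each threshold value is weighted by its mean feature size.
    return sum(t * s for t, s in subset.items()) / total

# Mean feature sizes from Table I, keyed by threshold level.
table_i_sizes = {35: 697.5, 50: 35.1, 75: 187.3, 140: 192.9, 205: 131.2, 225: 6.4}

# Assumed 50% cutoff: only the washed-out row at t=35 (S = 697.5) is excluded.
t0 = optimal_threshold(table_i_sizes, max_fraction=0.5)
```

With these inputs the weighted average comes to about 128.7, inside the plateau region
between t=75 and t=205. The noisy rows at t=50 and t=225 contribute little because their
mean feature sizes, and hence their weights, are small.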

More typically, a cutoff threshold value (t_c) is determined from the counting
data (N(t)) and used to determine the relevant subset of the mean feature size data
(S(t)). The relevant subset of the mean feature size data (S(t)) is selected by including
all values for which t > t_c or all values for which t ≥ t_c. Typically, the cutoff threshold
value (t_c) is determined as a function of the counting data (N(t)), which function
determines a cutoff threshold value (t_c) that falls in a transition range between the early
noise peak and the plateau region. Typically, the cutoff threshold value (t_c) is
determined by a function t_c = 2 t_P% - t_Q%; wherein t_P% is the lowest value of t for which
N(t) has not exceeded P% of the maximum value of N observed in the counting data
(N_max), wherein t_Q% is the lowest value of t for which N(t) has not exceeded Q% of the
maximum value of N observed in the counting data (N_max), wherein P% > Q%, and
wherein P% and Q% are independently selected from numbers between 1% and 100%.
More typically, P% is selected from numbers between 50% and 95% and Q% is
selected from numbers between 5% and 95%. Most typically, P% = 95% and Q% =
50%.
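
Algebraically, the cutoff function is a linear extrapolation: t_c lies beyond t_P% by the
same distance that separates t_Q% from t_P%. The worked line below uses purely
hypothetical values of t_95% and t_50%, chosen only to show the arithmetic; they are not
read from Fig. 3.

```latex
\begin{align*}
t_c &= 2\,t_{P\%} - t_{Q\%} = t_{P\%} + \left(t_{P\%} - t_{Q\%}\right) \\
\text{e.g. } t_{95\%} = 48,\; t_{50\%} = 42 \;\Rightarrow\; t_c &= 2(48) - 42 = 54
\end{align*}
```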

After an optimal threshold value (t_0) for the image is determined, the image
corresponding to thresholding at (t_0) may be output as data or as an image in any
suitable manner, including visible displays or printing processes and also including
further data transforming processes, such as use of the output image as a mask for
further analysis of the original image input.

Typically, the present method is automated by use of computer technology to
provide a system for choosing an optimal threshold value (t_0) for the image or for
thresholding an image. The system may comprise: a) an image device for providing a
digitized image; b) a data storage device; and c) a central processing unit for receiving
the digitized image from the image device and which can write to and read from the
data storage device. The present invention also encompasses data storage media having
recorded thereon software that upon installation in a computer and execution of the
software will cause the computer to carry out a method disclosed herein for choosing an
optimal threshold value (t_0) for the image or for thresholding an image.

The present method may be extended to three dimensions. In the
three-dimensional case, it will be understood that mean feature size refers to volume
rather than area. Furthermore, higher dimensional images and data compilations might also
be analyzed by use of the methods according to the present invention. As will be
apparent, thresholding according to the method of the present invention might be used
for noise elimination, decision making or pattern identification in multi-dimensional
data space.

This invention is useful in the automated reading of optical information,
particularly in the automated reading of a matrix of sample points on a tray, slide, or
suchlike, which may be comprised in automated analytical processes like DNA
detection or typing.

It will be understood that these exemplary embodiments in no way limit the 
scope of the invention. Steps in the thresholding process do not necessarily need to be 
performed strictly in the order described here; any order that accomplishes the same 
goals would work. Other modifications of the invention will be apparent to those 
skilled in the art in view of the foregoing description. These descriptions are intended
merely to provide specific examples of embodiments which clearly disclose the present
invention. Accordingly, the invention is not limited to the described embodiments or to 
the use of specific elements, dimensions, materials or configurations contained therein. 
All alternative modifications and variations of the present invention which fall within 
the spirit and scope of the appended claims are covered. All publications and patents 
are herein incorporated by reference to the same extent as if each individual publication 
or patent was specifically and individually indicated to be incorporated by reference. 